15 research outputs found

    Learning nonlinear monotone classifiers using the Choquet Integral

    Get PDF
    In der jüngeren Vergangenheit hat das Lernen von Vorhersagemodellen, die eine monotone Beziehung zwischen Ein- und Ausgabevariablen garantieren, wachsende Aufmerksamkeit im Bereich des maschinellen Lernens erlangt. Besonders für flexible nichtlineare Modelle stellt die Gewährleistung der Monotonie eine große Herausforderung für die Umsetzung dar. Die vorgelegte Arbeit nutzt das Choquet Integral als mathematische Grundlage für die Entwicklung neuer Modelle für nichtlineare Klassifikationsaufgaben. Neben den bekannten Einsatzgebieten des Choquet-Integrals als flexible Aggregationsfunktion in multi-kriteriellen Entscheidungsverfahren, findet der Formalismus damit Eingang als wichtiges Werkzeug für Modelle des maschinellen Lernens. Neben dem Vorteil, Monotonie und Flexibilität auf elegante Weise mathematisch vereinbar zu machen, bietet das Choquet-Integral Möglichkeiten zur Quantifizierung von Wechselwirkungen zwischen Gruppen von Attributen der Eingabedaten, wodurch interpretierbare Modelle gewonnen werden können. In der Arbeit werden konkrete Methoden für das Lernen mit dem Choquet Integral entwickelt, welche zwei unterschiedliche Ansätze nutzen, die Maximum-Likelihood-Schätzung und die strukturelle Risikominimierung. Während der erste Ansatz zu einer Verallgemeinerung der logistischen Regression führt, wird der zweite mit Hilfe von Support-Vektor-Maschinen realisiert. In beiden Fällen wird das Lernproblem imWesentlichen auf die Parameter-Identifikation von Fuzzy-Maßen für das Choquet Integral zurückgeführt. Die exponentielle Anzahl von Freiheitsgraden zur Modellierung aller Attribut-Teilmengen stellt dabei besondere Herausforderungen im Hinblick auf Laufzeitkomplexität und Generalisierungsleistung. Vor deren Hintergrund werden die beiden Ansätze praktisch bewertet und auch theoretisch analysiert. Zudem werden auch geeignete Verfahren zur Komplexitätsreduktion und Modellregularisierung vorgeschlagen und untersucht. Die experimentellen Ergebnisse sind auch für anspruchsvolle Referenzprobleme im Vergleich mit aktuellen Verfahren sehr gut und heben die Nützlichkeit der Kombination aus Monotonie und Flexibilität des Choquet Integrals in verschiedenen Ansätzen des maschinellen Lernens hervor

    Ordinal Choquistic Regression

    Get PDF

    Choquistic Regression: Generalizing Logistic Regression using the Choquet Integral

    Get PDF
    In this paper, we propose a generalization of logistic regression based on the Choquet integral. The basic idea of our approach, referred to as choquistic regression, is to replace the linear function of predictor variables, which is commonly used in logistic regression to model the log odds of the positive class, by the Choquet integral. Thus, it becomes possible to capture non-linear dependencies and interactions among predictor variables while preserving two important properties of logistic regression, namely the comprehensibility of the model and the possibility to ensure its monotonicity in individual predictors. In experimental studies with real and benchmark data, choquistic regression consistently improves upon standard logistic regression in terms of predictive accuracy

    Global, regional, and national cancer incidence, mortality, years of life lost, years lived with disability, and disability-Adjusted life-years for 29 cancer groups, 1990 to 2017 : A systematic analysis for the global burden of disease study

    Get PDF
    Importance: Cancer and other noncommunicable diseases (NCDs) are now widely recognized as a threat to global development. The latest United Nations high-level meeting on NCDs reaffirmed this observation and also highlighted the slow progress in meeting the 2011 Political Declaration on the Prevention and Control of Noncommunicable Diseases and the third Sustainable Development Goal. Lack of situational analyses, priority setting, and budgeting have been identified as major obstacles in achieving these goals. All of these have in common that they require information on the local cancer epidemiology. The Global Burden of Disease (GBD) study is uniquely poised to provide these crucial data. Objective: To describe cancer burden for 29 cancer groups in 195 countries from 1990 through 2017 to provide data needed for cancer control planning. Evidence Review: We used the GBD study estimation methods to describe cancer incidence, mortality, years lived with disability, years of life lost, and disability-Adjusted life-years (DALYs). Results are presented at the national level as well as by Socio-demographic Index (SDI), a composite indicator of income, educational attainment, and total fertility rate. We also analyzed the influence of the epidemiological vs the demographic transition on cancer incidence. Findings: In 2017, there were 24.5 million incident cancer cases worldwide (16.8 million without nonmelanoma skin cancer [NMSC]) and 9.6 million cancer deaths. The majority of cancer DALYs came from years of life lost (97%), and only 3% came from years lived with disability. The odds of developing cancer were the lowest in the low SDI quintile (1 in 7) and the highest in the high SDI quintile (1 in 2) for both sexes. In 2017, the most common incident cancers in men were NMSC (4.3 million incident cases); tracheal, bronchus, and lung (TBL) cancer (1.5 million incident cases); and prostate cancer (1.3 million incident cases). The most common causes of cancer deaths and DALYs for men were TBL cancer (1.3 million deaths and 28.4 million DALYs), liver cancer (572000 deaths and 15.2 million DALYs), and stomach cancer (542000 deaths and 12.2 million DALYs). For women in 2017, the most common incident cancers were NMSC (3.3 million incident cases), breast cancer (1.9 million incident cases), and colorectal cancer (819000 incident cases). The leading causes of cancer deaths and DALYs for women were breast cancer (601000 deaths and 17.4 million DALYs), TBL cancer (596000 deaths and 12.6 million DALYs), and colorectal cancer (414000 deaths and 8.3 million DALYs). Conclusions and Relevance: The national epidemiological profiles of cancer burden in the GBD study show large heterogeneities, which are a reflection of different exposures to risk factors, economic settings, lifestyles, and access to care and screening. The GBD study can be used by policy makers and other stakeholders to develop and improve national and local cancer control in order to achieve the global targets and improve equity in cancer care. © 2019 American Medical Association. All rights reserved.Peer reviewe

    Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990-2017: a systematic analysis for the Global Burden of Disease Study 2017.

    Get PDF
    The Global Burden of Diseases, Injuries and Risk Factors 2017 includes a comprehensive assessment of incidence, prevalence, and years lived with disability (YLDs) for 354 causes in 195 countries and territories from 1990 to 2017. Previous GBD studies have shown how the decline of mortality rates from 1990 to 2016 has led to an increase in life expectancy, an ageing global population, and an expansion of the non-fatal burden of disease and injury. These studies have also shown how a substantial portion of the world's population experiences non-fatal health loss with considerable heterogeneity among different causes, locations, ages, and sexes. Ongoing objectives of the GBD study include increasing the level of estimation detail, improving analytical strategies, and increasing the amount of high-quality data. METHODS: We estimated incidence and prevalence for 354 diseases and injuries and 3484 sequelae. We used an updated and extensive body of literature studies, survey data, surveillance data, inpatient admission records, outpatient visit records, and health insurance claims, and additionally used results from cause of death models to inform estimates using a total of 68 781 data sources. Newly available clinical data from India, Iran, Japan, Jordan, Nepal, China, Brazil, Norway, and Italy were incorporated, as well as updated claims data from the USA and new claims data from Taiwan (province of China) and Singapore. We used DisMod-MR 2.1, a Bayesian meta-regression tool, as the main method of estimation, ensuring consistency between rates of incidence, prevalence, remission, and cause of death for each condition. YLDs were estimated as the product of a prevalence estimate and a disability weight for health states of each mutually exclusive sequela, adjusted for comorbidity. We updated the Socio-demographic Index (SDI), a summary development indicator of income per capita, years of schooling, and total fertility rate. Additionally, we calculated differences between male and female YLDs to identify divergent trends across sexes. GBD 2017 complies with the Guidelines for Accurate and Transparent Health Estimates Reporting

    Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990-2017: a systematic analysis for the Global Burden of Disease Study 2017

    Get PDF
    The Global Burden of Diseases, Injuries, and Risk Factors Study 2017 (GBD 2017) includes a comprehensive assessment of incidence, prevalence, and years lived with disability (YLDs) for 354 causes in 195 countries and territories from 1990 to 2017. Previous GBD studies have shown how the decline of mortality rates from 1990 to 2016 has led to an increase in life expectancy, an ageing global population, and an expansion of the non-fatal burden of disease and injury. These studies have also shown how a substantial portion of the world's population experiences non-fatal health loss with considerable heterogeneity among different causes, locations, ages, and sexes. Ongoing objectives of the GBD study include increasing the level of estimation detail, improving analytical strategies, and increasing the amount of high-quality data.; We estimated incidence and prevalence for 354 diseases and injuries and 3484 sequelae. We used an updated and extensive body of literature studies, survey data, surveillance data, inpatient admission records, outpatient visit records, and health insurance claims, and additionally used results from cause of death models to inform estimates using a total of 68 781 data sources. Newly available clinical data from India, Iran, Japan, Jordan, Nepal, China, Brazil, Norway, and Italy were incorporated, as well as updated claims data from the USA and new claims data from Taiwan (province of China) and Singapore. We used DisMod-MR 2.1, a Bayesian meta-regression tool, as the main method of estimation, ensuring consistency between rates of incidence, prevalence, remission, and cause of death for each condition. YLDs were estimated as the product of a prevalence estimate and a disability weight for health states of each mutually exclusive sequela, adjusted for comorbidity. We updated the Socio-demographic Index (SDI), a summary development indicator of income per capita, years of schooling, and total fertility rate. Additionally, we calculated differences between male and female YLDs to identify divergent trends across sexes. GBD 2017 complies with the Guidelines for Accurate and Transparent Health Estimates Reporting. Globally, for females, the causes with the greatest age-standardised prevalence were oral disorders, headache disorders, and haemoglobinopathies and haemolytic anaemias in both 1990 and 2017. For males, the causes with the greatest age-standardised prevalence were oral disorders, headache disorders, and tuberculosis including latent tuberculosis infection in both 1990 and 2017. In terms of YLDs, low back pain, headache disorders, and dietary iron deficiency were the leading Level 3 causes of YLD counts in 1990, whereas low back pain, headache disorders, and depressive disorders were the leading causes in 2017 for both sexes combined. All-cause age-standardised YLD rates decreased by 3·9% (95% uncertainty interval [UI] 3·1-4·6) from 1990 to 2017; however, the all-age YLD rate increased by 7·2% (6·0-8·4) while the total sum of global YLDs increased from 562 million (421-723) to 853 million (642-1100). The increases for males and females were similar, with increases in all-age YLD rates of 7·9% (6·6-9·2) for males and 6·5% (5·4-7·7) for females. We found significant differences between males and females in terms of age-standardised prevalence estimates for multiple causes. The causes with the greatest relative differences between sexes in 2017 included substance use disorders (3018 cases [95% UI 2782-3252] per 100 000 in males vs s1400 [1279-1524] per 100 000 in females), transport injuries (3322 [3082-3583] vs 2336 [2154-2535]), and self-harm and interpersonal violence (3265 [2943-3630] vs 5643 [5057-6302]). Global all-cause age-standardised YLD rates have improved only slightly over a period spanning nearly three decades. However, the magnitude of the non-fatal disease burden has expanded globally, with increasing numbers of people who have a wide spectrum of conditions. A subset of conditions has remained globally pervasive since 1990, whereas other conditions have displayed more dynamic trends, with different ages, sexes, and geographies across the globe experiencing varying burdens and trends of health loss. This study emphasises how global improvements in premature mortality for select conditions have led to older populations with complex and potentially expensive diseases, yet also highlights global achievements in certain domains of disease and injury

    Learning nonlinear monotone classifiers using the Choquet Integral

    No full text
    In der jüngeren Vergangenheit hat das Lernen von Vorhersagemodellen, die eine monotone Beziehung zwischen Ein- und Ausgabevariablen garantieren, wachsende Aufmerksamkeit im Bereich des maschinellen Lernens erlangt. Besonders für flexible nichtlineare Modelle stellt die Gewährleistung der Monotonie eine große Herausforderung für die Umsetzung dar. Die vorgelegte Arbeit nutzt das Choquet Integral als mathematische Grundlage für die Entwicklung neuer Modelle für nichtlineare Klassifikationsaufgaben. Neben den bekannten Einsatzgebieten des Choquet-Integrals als flexible Aggregationsfunktion in multi-kriteriellen Entscheidungsverfahren, findet der Formalismus damit Eingang als wichtiges Werkzeug für Modelle des maschinellen Lernens. Neben dem Vorteil, Monotonie und Flexibilität auf elegante Weise mathematisch vereinbar zu machen, bietet das Choquet-Integral Möglichkeiten zur Quantifizierung von Wechselwirkungen zwischen Gruppen von Attributen der Eingabedaten, wodurch interpretierbare Modelle gewonnen werden können. In der Arbeit werden konkrete Methoden für das Lernen mit dem Choquet Integral entwickelt, welche zwei unterschiedliche Ansätze nutzen, die Maximum-Likelihood-Schätzung und die strukturelle Risikominimierung. Während der erste Ansatz zu einer Verallgemeinerung der logistischen Regression führt, wird der zweite mit Hilfe von Support-Vektor-Maschinen realisiert. In beiden Fällen wird das Lernproblem imWesentlichen auf die Parameter-Identifikation von Fuzzy-Maßen für das Choquet Integral zurückgeführt. Die exponentielle Anzahl von Freiheitsgraden zur Modellierung aller Attribut-Teilmengen stellt dabei besondere Herausforderungen im Hinblick auf Laufzeitkomplexität und Generalisierungsleistung. Vor deren Hintergrund werden die beiden Ansätze praktisch bewertet und auch theoretisch analysiert. Zudem werden auch geeignete Verfahren zur Komplexitätsreduktion und Modellregularisierung vorgeschlagen und untersucht. Die experimentellen Ergebnisse sind auch für anspruchsvolle Referenzprobleme im Vergleich mit aktuellen Verfahren sehr gut und heben die Nützlichkeit der Kombination aus Monotonie und Flexibilität des Choquet Integrals in verschiedenen Ansätzen des maschinellen Lernens hervor
    corecore